
8 research outputs found

Shape Modeling with Spline Partitions

Author: Elliott Lloyd
Ge Shufei
Wang Shijia
Publication date: 06/11/2022

Shape modelling (with methods that output shapes) is a new and important task in Bayesian nonparametrics and bioinformatics. In this work, we focus on Bayesian nonparametric methods for capturing shapes by partitioning a space using curves. In related work, the classical Mondrian process is used to partition spaces recursively with axis-aligned cuts, and is widely applied to multi-dimensional and relational data. The Mondrian process outputs hyper-rectangles. Recently, the random tessellation process was introduced as a generalization of the Mondrian process, partitioning a domain with non-axis-aligned cuts in a space of arbitrary dimension, and outputting polytopes. Motivated by these processes, we propose a novel parallelized Bayesian nonparametric approach to partition a domain with curves, enabling complex data shapes to be acquired. We apply our method to an HIV-1-infected human macrophage image dataset, and also to simulated datasets, to illustrate our approach. We compare our method to support vector machines, random forests and state-of-the-art computer vision methods such as simple linear iterative clustering superpixel image segmentation. We develop an R package that is available at https://github.com/ShufeiGe/Shape-Modeling-with-Spline-Partitions

arXiv.org e-Print Archive

Random Tessellation Forests

Author: Elliott Lloyd T.
Ge Shufei
Teh Yee Whye
Wang Liangliang
Wang Shijia
Publication date: 01/01/2019

Space partitioning methods such as random forests and the Mondrian process are powerful machine learning methods for multi-dimensional and relational data, and are based on recursively cutting a domain. The flexibility of these methods is often limited by the requirement that the cuts be axis-aligned. The Ostomachion process and the self-consistent binary space partitioning-tree process were recently introduced as generalizations of the Mondrian process for space partitioning with non-axis-aligned cuts in the two-dimensional plane. Motivated by the need for a multi-dimensional partitioning tree with non-axis-aligned cuts, we propose the Random Tessellation Process (RTP), a framework that includes the Mondrian process and the binary space partitioning-tree process as special cases. We derive a sequential Monte Carlo algorithm for inference, and provide random forest methods. Our process is self-consistent and can relax axis-aligned constraints, allowing complex inter-dimensional dependence to be captured. We present a simulation study, and analyse gene expression data of brain tissue, showing improved accuracies over other methods. Comment: 11 pages, 4 figures

arXiv.org e-Print Archive

Oxford University Research Archive

Genome-Wide Association with Uncertainty in the Genetic Similarity Matrix

Author: Colijn Caroline
Elliott Lloyd T
Ge Shufei
Grandjean Louis
Sobkowiak Benjamin
Wang Liangliang
Wang Shijia
Publication venue: 'Mary Ann Liebert Inc'
Publication date: 14/11/2022

Genome-wide association studies (GWASs) are often confounded by population stratification and structure. Linear mixed models (LMMs) are a powerful class of methods for uncovering genetic effects while controlling for such confounding. LMMs include random effects for a genetic similarity matrix, and they assume that a true genetic similarity matrix is known. However, uncertainty about the phylogenetic structure of a study population may degrade the quality of LMM results. This may happen in bacterial studies in which the number of samples or loci is small, or in studies with low-quality genotyping. In this study, we develop methods for linear mixed models in which the genetic similarity matrix is unknown and is derived from Markov chain Monte Carlo estimates of the phylogeny. We apply our model to a GWAS of multidrug resistance in tuberculosis, and illustrate our methods on simulated data.

UCL Discovery

Statistical machine learning in computational genetics

Author: Ge Shufei
Publication date: 03/07/2020

Statistical machine learning has played a key role in many areas, such as biology, health sciences, finance and genetics. Important tasks in computational genetics include disease prediction, capturing shapes within images, computation of genetic sharing between pairs of individuals, genome-wide association studies and image clustering. This thesis develops several learning methods to address these computational genetics problems. Firstly, motivated by the need for fast computation of genetic sharing among pairs of individuals, we propose the fastest algorithms for computing the kinship coefficient of a set of individuals with a known large pedigree. Moreover, we consider the possibility that the founders of the known pedigree may themselves be inbred and compute the appropriate inbreeding-adjusted kinship coefficients, which has not been addressed in the literature. Secondly, motivated by an imaging genetics study of the Alzheimer's Disease Neuroimaging Initiative, we develop a Bayesian bivariate spatial group lasso model for multivariate regression analysis, which can be used to examine the influence of genetic variation on brain structure and accommodates the correlation structures typically seen in structural brain imaging data. We develop a mean-field variational Bayes algorithm and a Gibbs sampling algorithm to fit the model. We also incorporate Bayesian false discovery rate procedures to select SNPs. The new spatial model demonstrates superior performance over a standard model in our application. Thirdly, we propose the Random Tessellation Process (RTP) to model complex genetic data structures to predict disease status. The RTP is a multi-dimensional partitioning tree with non-axis-aligned cuts. We develop a sequential Monte Carlo (SMC) algorithm for inference. Our process is self-consistent and can relax axis-aligned constraints, allowing complex inter-dimensional dependence to be captured.
Fourthly, we propose the Random Tessellation with Splines (RTS) to acquire complex shapes within images. The RTS provides a framework for describing Bayesian nonparametric models based on partitioning two-dimensional Euclidean space with splines. We also develop an inference algorithm that is "embarrassingly parallel". Finally, we extend the mixtures of spatial spline regression with mixed-effects model under the Bayesian framework to accommodate streaming image data. We propose an SMC algorithm to analyze brain images in an online fashion.

Simon Fraser University Institutional Repository

Energy and Operation Characteristics of Electric Excavator With Innovative Hydraulic-Electric Dual Power Drive Boom System

Author: Bin Zhao
Lei Ge
Long Quan
Shufei Qiao
Yunxiao Hao
Zepeng Li
Publication venue: IEEE
Publication date: 01/01/2023

In existing electric excavators, the energy efficiency of the hydraulic system is less than 30% because of large throttling losses and wasted potential energy. To improve excavator energy efficiency, an electric excavator scheme using a hydraulic-electric dual-power drive boom system is proposed. A linear actuator, comprising an electro-mechanical unit and a hydraulic unit, is adopted in the boom system. The boom velocity is controlled by the electro-mechanical unit instead of a hydraulic valve to reduce throttling loss. The non-rod chamber of the linear actuator is connected to a hydraulic accumulator to reutilize the boom's gravitational potential energy. In addition, when the boom and other devices are operated together, the throttling loss caused by the load difference between multiple actuators can be reduced because the linear actuator can compensate for the pump pressure. The working principle, control strategy, and characteristics of the proposed electric excavator were analyzed through theory and experiments. The results show that the proposed system can reduce throttling loss and efficiently reutilize the boom's gravitational potential energy. During the boom lifting and lowering process, the reutilization rate of the boom's gravitational potential energy is 67.6%, and the energy consumption is reduced by 66.1%. During the land levelling process, the throttling loss of the electric excavator is reduced by 49.6% and the energy consumption is reduced by 38.1%. The research results will provide new methods for the electrification of construction machinery.

Directory of Open Access Journals

Insights into the bacterial species and communities of a full-scale anaerobic/anoxic/oxic wastewater treatment plant by using third-generation sequencing

Author: Asnicar
Bin Ji
Bokulich
Cao
Catone
Daims
Ge
Hongjiao Song
Ji
Ji
Kong
Langille
Li
Lücker
Mardis
Maxam
Nakano
Qin
Sanger
Shufei Zhang
Singer
Song
Tian
Tribelli
Wang
Xuechun Zhang
Yan
Yang
Ye
Zehua Kong
Zhang
Publication venue: 'Elsevier BV'

Crossref

Post-marketing safety surveillance and re-evaluation of Xueshuantong injection

Author: Chunxiao Li
D-Q Ren
E-W Baars
Fei Meng
FJ Wang
Ge Guan
GF He
Guanping Liu
HC Li
Hui Zhang
J Luo
JJ Jiang
Jun Yuan
Junhua Zhang
L Wang
L Zhang
Linyan Lv
Mingjun Zhu
Peng Zhou
Shufei Fu
Tao Xu
Weixia Li
Wu W. Feng
X Li
X Liao
Xiao Ling
XJ Guo
XL Li
XM Wang
Xuelin Li
Y Liu
Y Ren
Y Zhao
Y-Y Wang
Publication venue: 'Springer Science and Business Media LLC'

Crossref